ABSTRACT

The problem is to search for the t largest observations in
a random sample of size n by asking binary type questions of the
people (or items) in the sample without collecting any exact data
whatever. The unordered and ordered cases are both considered; in
the latter case the complete ranking is of special interest. Two
different criteria of optimality are considered: (1) to minimize
the expected number of questions required and (2) to maximize the
probability of terminating the search in at most r questions for
specified r. Optimal procedures are found and compared and in some
sense the solutions for these two criteria are close to each other.
The analysis is nonparametric in the sense that it holds for any
underlying sampling distribution but the actual optimal procedures
depend on the specified distribution.
1. Introduction


This problem originated in some research on the optimal design
of organizations, though it clearly has many other applications. Consider
the simplest problem of resource allocation, in which there is one input
to be allocated among many possible users. All users produce the same
one product, and each is characterized by an output-input ratio independent
of the scale of operations. Optimal resource allocation would require
allocating the entire input to the user with the highest output-input ratio.

Suppose there are a large number of users. In the first instance,
each user knows his or her own ratio only, while the center (the agent
performing the allocation) does not. The center must acquire the
information by asking questions of the users. However, in the spirit
of information theory, the more exact the required answer the more
costly its transfer is. We can reduce the problem to that of asking
dichotomous questions. Since the center does not know the individual
values of the output-input ratio, it may treat them as members of a
random sample from a distribution. In this paper, we assume the
distribution known.

There are many other situations in which the choice of the
largest element from a sample may reasonably be made by binary-type
questions. For instance, the data to be collected may have a confidential
or semi-confidential nature, and people may be reluctant to furnish
information about their age or salary or even about the amount of money
presently in their wallet. On the other hand, people may be willing
to state that the quantity X in question (i.e., X equals their age)
is greater than 30 and, if necessary, later tell you that it is under
45, etc. The problem (or goal) is to continue such questions with
the n people (or a subset of them) in order to find the one whose
X-characteristic is the most extreme in a given direction (say, the
largest). A more general goal would be to find the t largest (smallest)
of n with or without respect to order. For the latter goal, the case
t = n-1 (or n) would correspond to a complete ordering of the people
in the sample and this is an important special case. Clearly, the case
t = 1 reduces to the former goal.

The criterion to be used has to be specified exactly in order
to either find the optimal procedure or decide whether a given procedure
is optimal. The main criterion of interest in this paper is the expected
number of questions that have to be asked. We are also interested in
criteria such as "maximizing the probability of terminating in r
steps." For most of our goals, the latter criterion with r = 1 and
the main criterion (expectation) give results that are in "close proximity"
from the point of view of applications.

Several things should be noted:

(1) We are not allowing paired comparisons here; we do compare
all x's with a single specified constant and call this
one question.

(2) We assume in our illustrations that the X-characteristics
of the n people, $x_1, x_2, \ldots, x_n$, are independent and
identically distributed (iid) (or at least exchangeable) with
cdf F(x), which is known to us (the case of unknown F
will be considered by the authors in a separate publication).

(3) We assume that the observations (x's) are continuous so
that with probability one we can assert that no two are
exactly equal. We recognize that this may not be strictly
true in the applications noted above and that practical
modifications will be necessary to handle ties (e.g., two
people may both be 45 years old and the data available to
us does not give ages finer than to the nearest year).
However, the theoretical analysis will not take this into
account; it simply uses the fact that with probability one
under very weak restrictions (independence being more than
sufficient) no two x's will be exactly equal.


Remark: Moreover, there will usually be a practical lower
bound to the fineness of the data, say $\varepsilon$, that encourages ties for
large sample sizes. If we expect ties in the sample we modify our
procedure by not allowing in our questions (which are of the form:
"Is your X larger than c?") two constants within $\varepsilon$ of each other.
Then it is easy to show that the results we have below on expectation
are upper bounds for this new modified procedure, even if the
probability of ties is not small.

Our solutions (for the case of known cdf F(x)) are strongly
dependent on the given cdf F(x) (i.e., when the true cdf F(x) is
completely specified). However, the solutions are nonparametric in
the sense that the instructions and tables needed to carry out the
procedures are the same regardless of the particular assumed F(x).
Thus our tables would specify a value of p and the procedure (at
the first step) might be to solve F(c) = p. Another equivalent way
of stating this is that the problem has been reduced to that of the
uniform (0,1) distribution.
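The step "solve F(c) = p" can be illustrated with a short sketch. It is only an illustration under stated assumptions: the helper name `solve_threshold`, the bisection method, and the exponential cdf used as the example distribution are all choices made here, not part of the original procedure.

```python
import math

def solve_threshold(cdf, p, lo, hi, tol=1e-12):
    """Return c in [lo, hi] with cdf(c) approximately p.

    Plain bisection; assumes cdf is continuous and non-decreasing
    and that cdf(lo) <= p <= cdf(hi).
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def exponential_cdf(x):
    # Assumed example distribution: F(x) = 1 - exp(-x) for x >= 0.
    return 1.0 - math.exp(-x) if x > 0 else 0.0

# Threshold for a tabulated value p = 0.5 under the assumed cdf.
c_half = solve_threshold(exponential_cdf, 0.5, 0.0, 50.0)
```

Only the value p comes from a table; the cdf enters solely through this conversion, which is the sense in which the problem reduces to the uniform (0,1) case.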

The results obtained are quite striking. Thus in the basic
illustration (t = 1) the minimal expected number of questions required
is less than 2 1/2, namely 2.42778. This result holds for any
starting sample size n and for any known cdf F(x). The procedure
that maximizes the probability of terminating on the very next step
(the second criterion with r = 1) has a result not far removed, namely
the corresponding expected number of questions required is 2.44144.
The optimal procedure in the latter sense is simpler because it does
not require the use of any table of optimal p (or c)-values.

Of course, there could be some other ways of asking group questions,
e.g., rather than asking a binary type question leading to a yes or no
answer (such as "raise your hand if your X is larger than c and
do not raise otherwise"), we could allow questions with 3 possible
answers (such as "raise your right hand (or red flag) if $X > c_2$,
your left hand (or blue flag) if $X < c_1$, and no hand (flag) otherwise").
Then with $c_1 < c_2$ we can partition the sample with one question into
at most 3 disjoint sets: $X < c_1$, $c_1 < X < c_2$, $X > c_2$. In the same
way questions with $k_0$ (> 3) possible answers may be allowed and, of
course, we should make every attempt to use such questions if we wish to
attain an optimal solution. (The reason for this is that for $k_2 > k_1 \ge 2$
an optimal procedure with "$k_2$-way" questions generally gives better
results than an optimal procedure with "$k_1$-way" questions.) In the
illustrations below only binary type questions are allowed; however,
the same approach could be implemented in the cases of more complicated
"sampling procedures" (we refer to the type of question allowed as a
part of our "sampling procedure").

We regard our problem as the partial or complete ordering of
a sample without the necessity of knowing any particular values of the
observations in the sample.

In Section 2 we consider the problem of selecting without
order the t largest of n observations in a random sample; the same
problem with ordering is discussed at the end of Section 3. The main
part of Section 3 deals with the problem of a complete ordering of the
sample.


2.  Selecting  Without  Order  the  t  Largest 

2.1  Preliminaries 

Consider the problem of selecting without order the t largest
observations in a sample (say, of people) of size n when the above
type of sampling is available to us, i.e., we can ask any subset of
the n people to each raise his hand if (and only if) his X > c,
where c is at our disposal to select. We terminate when (and only
when) we definitely have the t largest separated from all the others.
(The modifications required in the case of ties will be evident in the
light of a remark made in Section 1 above.)

It should be understood that if we obtain a subset of size k
which is less than t as a result of the first question then we continue
looking for t - k from the batch of size n - k; if k > t then we
continue looking for t from the reduced batch of size k.
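The questioning scheme just described can be simulated directly for the simplest goal t = 1. The following is a minimal sketch under stated assumptions: the observations are uniform on (0,1), the function names are hypothetical, and `rule(m)` stands for whatever probability p the procedure prescribes for a group of size m (here illustrated with p = 1/m).

```python
import random

def questions_to_find_largest(xs, rule, max_questions=100000):
    """Count binary questions until the single largest value is isolated.

    xs   : distinct observations, assumed uniform on (0, 1)
    rule : rule(m) -> probability p that one member of a group of size m
           answers "yes" to the next question (hypothetical callback)
    """
    lo, hi = 0.0, 1.0          # interval known to contain the whole group
    group = list(xs)
    asked = 0
    while len(group) > 1:
        if asked >= max_questions:
            raise RuntimeError("no separation; are there ties in xs?")
        p = rule(len(group))
        c = hi - p * (hi - lo)  # threshold with conditional P(X > c) = p
        asked += 1
        above = [x for x in group if x > c]
        if len(above) == 1:
            return asked        # exactly one hand raised: largest found
        if above:
            group, lo = above, c  # two or more hands: keep only those
        else:
            hi = c                # no hands: same group, smaller interval
    return asked

def average_questions(n, trials, rule, seed=0):
    """Monte Carlo estimate of the expected number of questions."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += questions_to_find_largest([rng.random() for _ in range(n)], rule)
    return total / trials

estimate = average_questions(3, 20000, lambda m: 1.0 / m, seed=1)
```

For n = 3 and the rule p = 1/m the estimate should come out near 13/6, the exact value under this rule, up to Monte Carlo noise.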

Let $\pi_{i,j}$ denote the probability that j people out of i
will respond affirmatively to a single question. In fact these $\pi_{i,j}$
values can in the most general set-up depend on the entire history of the
procedure. In other words, if $\{\omega\}$ denotes the space of trajectories
of a random process associated with our procedure, then after m
questions have been asked (or at a moment m) $\pi_{i,j}(\omega) = \pi_{i,j}(\omega_s, s \le m)$
for $j \le i$. However, using the assumption that $\pi_{i,j} > 0$ for $j \le i$,
it can be shown in a manner completely analogous to that given in a
book by Ross [1970, Ch. 6] that if the X's are independent (or at
least exchangeable), then the optimal solution for our principal criterion
(minimum expected number of questions) is obtained by a stationary
Markov decision procedure, i.e., with transition probabilities $\pi_{i,j}$
which do not depend on m.

As a result of the above we can use the Markovian property to
write for $n \ge 2$ the basic equation (2.1) for our procedure,
where $P_{n,t}(r)$ denotes the probability of terminating in at most r
steps under a scheme described above if we start with n and our goal
is to find the top t unordered. The boundary condition is simply that
for all t we should have $P_{t,t}(r)$ equal to zero for $r \ge 1$ and
equal to one for r = 0. If we multiply both sides of (2.1) by r - 1
and sum from r = 1 to $\infty$ (letting $\mu_n^{(t)}$ denote the expected number
of steps or questions required for our procedure with n and t
defined in the present goal), then we easily obtain

$$\mu_n^{(t)} = \frac{1 + \sum_{j=t+1}^{n-1} \pi_{n,j}\,\mu_j^{(t)} + \sum_{j=1}^{t-1} \pi_{n,j}\,\mu_{n-j}^{(t-j)}}{1 - \pi_{n,0} - \pi_{n,n}} . \tag{2.2}$$
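Recurrence (2.2) can be evaluated mechanically once a rule for choosing p is fixed. The sketch below assumes binomial transition probabilities (the iid case) and a hypothetical callback `rule(m, s)` returning the p used when s items are sought among m; it is an illustration of the recurrence, not the authors' program.

```python
from functools import lru_cache
from math import comb

def expected_questions(n, t, rule):
    """Expected number of questions from recurrence (2.2).

    rule(m, s) -> p is an assumed callback giving the probability of a
    "yes" answer used for a group of size m when s largest are sought.
    """
    @lru_cache(maxsize=None)
    def mu(m, s):
        if s == 0 or s == m:
            return 0.0                      # nothing left to separate
        p = rule(m, s)
        q = 1.0 - p
        pi = [comb(m, j) * p**j * q**(m - j) for j in range(m + 1)]
        total = 1.0
        for j in range(s + 1, m):           # too many "yes": seek s among j
            total += pi[j] * mu(j, s)
        for j in range(1, s):               # too few: seek s - j among m - j
            total += pi[j] * mu(m - j, s - j)
        return total / (1.0 - pi[0] - pi[m])
    return mu(n, t)

def t_over_n(m, s):
    # The rule p = s/m, used here purely as an example.
    return s / m

mu_3_1 = expected_questions(3, 1, t_over_n)   # 13/6
mu_4_2 = expected_questions(4, 2, t_over_n)   # 50/21
```

With the rule p = s/m this gives 13/6 for (n, t) = (3, 1) and 50/21 (about 2.38095) for (n, t) = (4, 2).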


A sufficient condition for the expected time of absorption
(which is equivalent to the expected number of questions needed in
our problems) to be finite and hence for both sides of (2.2) to
be finite is that the $\pi_{i,j}$ be bounded away from zero; more precisely
that $\pi_{i,j} \ge \delta$ for all i and j ($j \le t \le i \le n$) for some $\delta > 0$.
(This sufficient condition can be shown to hold for our optimal procedure.)
An alternative way to show that the $\mu_n^{(t)}$ are all finite for the case
of the optimal procedure is to come up with some other (i.e., any)
procedure for which the expected times of absorption are finite for all
n. In fact, it will be shown that an optimal procedure in the sense
of our second criterion with r = 1 yields finite values for all $\mu_n^{(t)}$.


Assuming now that the transition probabilities (i.e., the $\pi_{n,j}$)
are defined by a finite number $\nu$ of parameters $p_1(n), \ldots, p_\nu(n)$,
and that the $\mu_i^{(s)}$ have been found for all $s \le t$ and $i \le n-1$, equation
(2.2) can be viewed as a recurrence relation from which the $\pi_{n,j}$
and the optimal $p_1(n), \ldots, p_\nu(n)$ can be found via minimization of
$\mu_n^{(t)}$. That defines an optimal procedure in the sense of our basic
criterion (minimizing the expectation). For our second criterion with
$r \ge 2$ (i.e., maximization of $P_{n,t}(r)$ for a specific r), we can
treat equation (2.1) in an analogous manner assuming that the $P_{i,s}(\tau)$
have been found for all $i \le n$, $s \le t$ and $\tau \le r-1$. For r = 1
the solution in the latter case is obtained simply by maximization
of the single coefficient $\pi_{n,t}$.

In the important special case where $x_1, \ldots, x_n$ are obtained
from continuous iid random variables, the transition probabilities
$\pi_{n,j}$ are binomial, namely $\pi_{n,j} = \binom{n}{j} p_n^{\,j} (1-p_n)^{n-j}$, where $p_n = 1 - F(c_n)$
and $c_n$ is a value which defines the initial question for the sample
size n. Therefore, in this setting the optimal solution is given by
a sequence of values $p_n^{(\tau)}$, $n \ge \tau + 1$, $\tau \le t$. Naturally, since the
search for the top t unordered is equivalent to that for the bottom
n - t unordered, we can assume that $n \ge 2t$. For the second criterion
with r = 1 we obtain $p_n$ values maximizing $\pi_{n,t} = \binom{n}{t} p_n^{\,t} (1-p_n)^{n-t}$;
it is easily shown that the sequence $\tilde p_n^{(t)} = t/n$ is optimal in the sense
of this criterion. We can see now that $\pi_{n,t} \downarrow t^t e^{-t}/t!$ as $n \to \infty$
and hence in the associated Markov chain the absorption time
$\tilde\mu_n^{(t)} \le (\lim \pi_{n,t})^{-1} = t!\,e^t/t^t$ and hence $\tilde\mu^{(t)} = \lim \tilde\mu_n^{(t)}$ satisfies
the same inequality. Thus for each t the quantity $t!\,(e/t)^t$ is
an upper bound on the optimal $\mu_n^{(t)}$ for any n; this upper bound
holds both for optimality of the second criterion with r = 1 as
well as for that of the basic criterion.
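The one-step stopping probability under the choice p = t/n and the resulting bound are easy to tabulate. The sketch below uses hypothetical function names and simply evaluates the binomial coefficient, its stated limit, and the bound $t!\,(e/t)^t$.

```python
from math import comb, exp, factorial

def stop_probability(n, t):
    """pi_{n,t} for p = t/n: chance that exactly t of n answer "yes"."""
    p = t / n
    return comb(n, t) * p**t * (1.0 - p)**(n - t)

def limit_stop_probability(t):
    """Limit of pi_{n,t} as n grows: t^t e^{-t} / t!."""
    return t**t * exp(-t) / factorial(t)

def upper_bound(t):
    """Bound t! (e/t)^t on the expected number of questions."""
    return factorial(t) * (exp(1.0) / t)**t

# pi_{n,1} for a few sample sizes, followed by the limiting value 1/e.
values_t1 = [stop_probability(n, 1) for n in (2, 5, 50, 500)]
limit_t1 = limit_stop_probability(1)
```

For t = 1 the bound equals e (about 2.718) and for t = 2 it equals e^2/2 (about 3.69).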

Now  we  can  use  (2.2)  to  derive  some  results  for  the  optimal 
procedure  with  respect  to  the  basic  criterion. 


2.2 The Optimal Procedure for the Basic Criterion


Theorem 2.1: Let $\{p_n^{(t)}\}$ be optimal in the sense of our basic
criterion and let $\mu_n^{(t)}$ denote the corresponding expectations. Then

(1) $\mu_n^{(t)}$ increases with n (for any fixed t) and
$\lim_{n\to\infty} \mu_n^{(t)} = \mu^{(t)}$ exists;

(2) $t/n \le p_n^{(t)} \le (t+1)/(n+1)$ for $t \le (n+1)/2$ and
$\lim_{n\to\infty} n\,p_n^{(t)} = \theta_t$ exists ($t \le \theta_t \le t+1$).


Proof: For simplicity let us first take t = 1 and write
p instead of $p_n^{(1)}$ (or instead of $p_n$), q instead of 1 - p, and $\mu_n$ instead
of $\mu_n^{(1)}$. Then (2.2) can be written as

$$\mu_n(p) = \frac{1 + \sum_{j=2}^{n-1} \binom{n}{j} p^j q^{n-j} \mu_j}{1 - p^n - q^n} , \tag{2.3}$$

which defines an analytic function of p on (0,1), so that for
the optimal p (i.e., $p = p_n$) $d\mu_n(p)/dp = 0$ and it
gives us

$$\begin{aligned}
\sum_{j=2}^{n-1} \binom{n}{j} p^{j-1} q^{n-j-1} (j - np)\,\mu_j
&= n(q^{n-1} - p^{n-1}) \Bigl(1 + \sum_{j=2}^{n-1} \binom{n}{j} p^j q^{n-j} \mu_j\Bigr) (1 - p^n - q^n)^{-1} \\
&= n(q^{n-1} - p^{n-1})\,\mu_n(p) = n(q^{n-1} - p^{n-1})\,\mu_n .
\end{aligned} \tag{2.4}$$

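Using the identity j - np = nq - (n - j) in the left-hand side of
(2.4) we then rewrite (2.4) in the form

$$nq \sum_{j=2}^{n-1} \binom{n}{j} p^{j-1} q^{n-j-1} \mu_j - n \sum_{j=2}^{n-1} \binom{n-1}{j} p^{j-1} q^{n-1-j} \mu_j = n(q^{n-1} - p^{n-1})\,\mu_n . \tag{2.5}$$

For any p and the optimal $\mu_{n-1}$ from (2.3) it is clear that

$$\mu_{n-1} \le \Bigl(1 + \sum_{j=2}^{n-2} \binom{n-1}{j} p^j q^{n-1-j} \mu_j\Bigr) (1 - p^{n-1} - q^{n-1})^{-1} . \tag{2.6}$$

Hence we obtain from (2.3), (2.5) and (2.6)

$$\mu_n (p q^{n-1} - p^n) \le \mu_n (1 - p^n - q^n) - \mu_{n-1} (1 - q^{n-1}) , \tag{2.7}$$

which yields

$$\mu_{n-1} \le \mu_n . \tag{2.8}$$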


Now, in the first line of (2.4) we expand j - np as two
separate terms and use (2.3) on the second sum. Equating this to the
third term in (2.4) we obtain, for any $p_n$ that satisfies $d\mu_n(p)/dp = 0$,

$$\mu_n(p_n) = \Bigl\{1 + \sum_{j=2}^{n-1} \binom{n-1}{j-1} p_n^{\,j-1} (1 - p_n)^{n-j} \mu_j\Bigr\} (1 - p_n^{\,n-1})^{-1} . \tag{2.9}$$


By virtue of (2.8) the right side of (2.9) is an increasing function
of $p_n$ and hence by (2.9) $\mu_n(p_n)$ also, i.e., if there are two
solutions $p_1, p_2$ with $p_1 < p_2$ then $\mu_n(p_1) \le \mu_n(p_2)$. Since $\mu_n(p)$
is an analytic function in (0,1) and tends to infinity as $p \to 1$ from
the left and also as $p \to 0$ from the right, there cannot be two such
p-values that are both local minima. Hence the minimum must be unique
and there are no analytic maxima in (0,1).

The next task is to locate this unique minimum. From the first
and third terms in (2.4)

$$\frac{d\mu_n(p)}{dp} = \frac{\sum_{j=2}^{n-1} \binom{n}{j} p^{j-1} q^{n-j-1} (j - np)\,\mu_j - n(q^{n-1} - p^{n-1})\,\mu_n(p)}{1 - p^n - q^n} . \tag{2.10}$$
We wish to show that for every n the derivative is negative at
p = 1/n and positive at p = 2/(n + 1). Firstly, for p = 1/n we
take into account that $\mu_n(p) \ge \mu_n \ge \mu_2 = 2$ (the
fact that $\mu_2 = 2$ can easily be derived either directly or from the
associated Markov chain). We first separate the term with j = 2
in the sum in the numerator of (2.10) and then in the remaining terms
expand j - np = j - 1 as two separate terms, thus obtaining for
n > 2 that the numerator of (2.10) for p = 1/n does not exceed


(n  -  l)q“"^ 


f-l“f^,n-l.  J-1  n-J 


-l“f^,n.  i  n-J  n-1  ^  . 

q  I  (Jp  q  -  q  +  p 
J=3  ^ 


(n  -  l)q““^{l  -  w^/2}  <  0 


so that

$$\left.\frac{d\mu_n(p)}{dp}\right|_{p = 1/n} < 0 .$$

In the same way that (2.9) has been derived from (2.4), we can
use (2.10) and some algebra to show that

$$\frac{d\mu_n(p)}{dp} = n\{p(1 - p^n - q^n)\}^{-1} \bigl\{[\mu_n(p) - \mu_{n-1}(p)](1 - q^{n-1}) + p^{n-1}[\mu_{n-1}(p) - \mu_{n-1}]\bigr\} . \tag{2.11}$$

We can see now from (2.11) that if $\mu_n(p) - \mu_{n-1}(p) \ge 0$ then
$d\mu_n(p)/dp > 0$ and with the continuity of $\mu_n(p)$ in p this
implies that $p_n^{(1)} < p$ if $\mu_n(p) - \mu_{n-1}(p) \ge 0$. It now remains
only to show that the latter inequality holds for p = 2/(n + 1).
But using (2.2) and the relation $\pi_{n-1,k} = \frac{n-k}{nq}\,\pi_{n,k}$ we have

$$\begin{aligned}
(1 - p^n - q^n)(1 - p^{n-1} - q^{n-1})[\mu_n(p) - \mu_{n-1}(p)]
&= -pq^{n-1} - qp^{n-1} \\
&\quad + \sum_{k=2}^{n-2} \pi_{n,k}\,\mu_k \Bigl\{1 - p^{n-1} - q^{n-1} - \frac{(1 - p^n - q^n)(n - k)}{nq}\Bigr\} \\
&\quad + n p^{n-1} q\,\mu_{n-1} (1 - p^{n-1} - q^{n-1})
\end{aligned} \tag{2.12}$$


and the terms in the sum in the right-hand side of (2.12) are
non-negative if $k \ge 3$ and p = 2/(n + 1). Since for $k \ge 3$ we have
$\mu_k \ge \mu_2 = 2$, the right-hand side in (2.12) is not smaller than

$$\begin{aligned}
&-pq^{n-1} - qp^{n-1} + 2\bigl\{(1 - p^n - q^n - npq^{n-1})(1 - p^{n-1} - q^{n-1}) \\
&\qquad - (1 - p^{n-1} - q^{n-1} - (n-1)pq^{n-2})(1 - p^n - q^n)\bigr\} \\
&\ge -pq^{n-1} - qp^{n-1} + 2pq^{n-2}(np - 1)
\end{aligned}$$

and with p = 2/(n + 1) the latter expression equals $pq^{n-1} - qp^{n-1} \ge 0$.
This implies that

$$\frac{1}{n} \le p_n^{(1)} \le \frac{2}{n+1} \tag{2.13}$$
and the equality holds only for n = 2. The inequality (2.13)
enables us to take limits in (2.3) in accordance with two subsequences
converging respectively to $\liminf n p_n$ and $\limsup n p_n$
(using the Poisson approximation to the binomial distribution).
The Poisson approximation and the monotonicity of $\mu_n$ imply
that $\underline\theta_1 = \liminf n p_n = \limsup n p_n = \bar\theta_1 = \theta_1$. An equation
can now be written instead of (2.3), namely

(2.H)  -  |i  *  e 


which is more convenient for the numerical search for $\theta_1$. This
completes the proof for t = 1.
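The statements proved for t = 1 can be checked numerically by minimizing the expectation over p one sample size at a time. The sketch below is an illustration only: it relies on the uniqueness of the minimum to justify a golden-section search (a method chosen here, not taken from the paper), and all function names are hypothetical.

```python
from math import comb, sqrt

def mu_of_p(n, p, mu):
    """Expected questions for t = 1 and group size n when the first question
    uses probability p and smaller groups use their optimal values mu[j]."""
    q = 1.0 - p
    num = 1.0 + sum(comb(n, j) * p**j * q**(n - j) * mu[j] for j in range(2, n))
    return num / (1.0 - p**n - q**n)

def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimise a unimodal function f on [lo, hi] by golden-section search."""
    r = (sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - r * (hi - lo), lo + r * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:                      # minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - r * (hi - lo)
            f1 = f(x1)
        else:                            # minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + r * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)

def optimal_single_largest(n_max):
    """Return {n: (p_n, mu_n)} for n = 2..n_max and t = 1."""
    mu, table = {}, {}
    for n in range(2, n_max + 1):
        p = golden_section_min(lambda x: mu_of_p(n, x, mu), 1e-6, 1.0 - 1e-6)
        mu[n] = mu_of_p(n, p, mu)
        table[n] = (p, mu[n])
    return table

table = optimal_single_largest(12)
```

For n = 3 this gives p close to 0.34627 and an expectation close to 2.16507; each computed p should fall between 1/n and 2/(n + 1), and the expectations should increase with n.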

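For t > 1 the proof of Theorem 2.1 follows essentially in
the same manner as above and hence we will outline the result below.

Firstly, two identities can be derived from (2.2) in the same
manner as (2.4) and (2.10) are derived from (2.3), namely

$$\begin{aligned}
\frac{d\mu_n^{(t)}(p)}{dp} = n\{p(1 - p^n - q^n)\}^{-1} \Bigl\{ &[\mu_n^{(t)}(p) - \mu_{n-1}^{(t)}(p)](1 - q^{n-1}) + p^{n-1}[\mu_{n-1}^{(t)}(p) - \mu_{n-1}^{(t)}] \\
&- \sum_{k=1}^{t-1} \pi_{n-1,k} \bigl(\mu_{n-k}^{(t-k)} - \mu_{n-k-1}^{(t-k)}\bigr) \Bigr\}
\end{aligned} \tag{2.15}$$

and

$$\begin{aligned}
\frac{d\mu_n^{(t)}(p)}{dp} = n\{q(1 - q^n - p^n)\}^{-1} \Bigl\{ &\sum_{k=t}^{n-2} \pi_{n-1,k} \bigl(\mu_{k+1}^{(t)} - \mu_k^{(t-1)}\bigr) - q^{n-1}[\mu_{n-1}^{(t-1)}(p) - \mu_{n-1}^{(t-1)}] \\
&- [\mu_n^{(t)}(p) - \mu_{n-1}^{(t-1)}(p)](1 - p^{n-1}) \Bigr\} .
\end{aligned} \tag{2.16}$$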


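Since in the point $p = p_n^{(t)}$ we have $d\mu_n^{(t)}(p)/dp = 0$, we can see from
(2.15) and the fact that $\mu_{n-1}^{(t)}(p) \ge \mu_{n-1}^{(t)}$ that

$$\begin{aligned}
\sum_{k=1}^{t-1} \pi_{n-1,k} \bigl(\mu_{n-k}^{(t-k)} - \mu_{n-k-1}^{(t-k)}\bigr)
&= [\mu_n^{(t)} - \mu_{n-1}^{(t)}(p)](1 - p^{n-1} - q^{n-1}) + p^{n-1}[\mu_n^{(t)} - \mu_{n-1}^{(t)}] \\
&\le [\mu_n^{(t)} - \mu_{n-1}^{(t)}](1 - q^{n-1})
\end{aligned} \tag{2.17}$$

and under the inductive assumption that $\mu_r^{(\nu)} \ge \mu_{r-1}^{(\nu)}$ for
r < n and $\nu$ < t, (2.17) implies that $\mu_n^{(t)} \ge \mu_{n-1}^{(t)}$. Then we can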

prove  that  the  differences  p^^j  -  increasing  with  J 

which  in  turn  implies  that  pf*^  -  increases  with  0(0  >  l) 

^▼<X  t— l^Ol  — 

.  (t)  (t-l)  (a)  (a) 

t««».  u„,  - 

For any point p in which $d\mu_n^{(t)}(p)/dp = 0$ we can write
now from (2.15) that

$$\mu_n^{(t)}(p) - \mu_{n-1}^{(t)} = \bigl[\mu_{n-1}^{(t)}(p) - \mu_{n-1}^{(t)}\bigr] \frac{1 - p^{n-1} - q^{n-1}}{1 - q^{n-1}} + \frac{\sum_{k=1}^{t-1} \pi_{n-1,k} \bigl(\mu_{n-k}^{(t-k)} - \mu_{n-1-k}^{(t-k)}\bigr)}{1 - q^{n-1}} \tag{2.18}$$

and from (2.16) that

$$\mu_n^{(t)}(p) - \mu_{n-1}^{(t-1)} = \bigl[\mu_{n-1}^{(t-1)}(p) - \mu_{n-1}^{(t-1)}\bigr] \frac{1 - p^{n-1} - q^{n-1}}{1 - p^{n-1}} + \frac{\sum_{k=t}^{n-2} \pi_{n-1,k} \bigl(\mu_{k+1}^{(t)} - \mu_k^{(t-1)}\bigr)}{1 - p^{n-1}} . \tag{2.19}$$

Suppose now that the function $\mu_r^{(\nu)}(p)$ decreases in the interval
$(0, p_r^{(\nu)})$ and increases in the interval $(p_r^{(\nu)}, 1)$ and that
$\nu/r \le p_r^{(\nu)} \le (\nu + 1)/(r + 1)$ for all $\nu \le t$, $r \le n-1$ and $\nu \le (1 + r)/2$.
Then the right-hand side in (2.18) decreases in the interval
$(0, p_{n-1}^{(t)})$ (because $\mu_{n-1}^{(t)}(p)$ decreases by virtue of the inductive
assumption and $\mu_{n-k}^{(t-k)} - \mu_{n-k-1}^{(t-k)}$ decreases with k due to the statement
above after (2.17)). At the same time, the right-hand side in
(2.19) increases in the interval $(p_{n-1}^{(t-1)}, 1)$. With the help
of the arguments following (2.9) we can see that there cannot be
two such p-values to the left of $p_{n-1}^{(t)}$ (to the right of $p_{n-1}^{(t-1)}$)
that are both local minima. But by virtue of the inductive
assumption

$$p_{n-1}^{(t-1)} < p_{n-1}^{(t)} ,$$

which implies that if there is a local minimum in the interval
$(p_{n-1}^{(t-1)}, p_{n-1}^{(t)})$ then this minimum is unique for the function $\mu_n^{(t)}(p)$
in (0,1). From (2.15) we obtain now that

$$\left.\frac{d\mu_n^{(t)}(p)}{dp}\right|_{p = p_{n-1}^{(t)}} = n\{p(1 - p^n - q^n)\}^{-1} \Bigl\{\bigl(\mu_n^{(t)}(p) - \mu_{n-1}^{(t)}\bigr)(1 - q^{n-1}) - \sum_{k=1}^{t-1} \pi_{n-1,k} \bigl(\mu_{n-k}^{(t-k)} - \mu_{n-k-1}^{(t-k)}\bigr)\Bigr\}$$

>  n{p(l-p“  -  q”)}“^(n 


(t) 

n 


~  P 


(t) 


n-l 
so that $\mu_n^{(t)}(p)$ increases in the interval $(p_{n-1}^{(t)}, 1)$. In a similar
manner we can show that

$$\left.\frac{d\mu_n^{(t)}(p)}{dp}\right|_{p = p_{n-1}^{(t-1)}} < 0$$

and hence there is a unique minimum of $\mu_n^{(t)}(p)$, and it lies
in the interval $(p_{n-1}^{(t-1)}, p_{n-1}^{(t)})$. Then for $t \ge 2$ we can show that
$d\mu_n^{(t)}(p)/dp$ is not positive in the point $\underline{p}$ of the intersection
of $\mu_n^{(t)}(p)$ and $\mu_{n-1}^{(t-1)}(p)$, and $d\mu_n^{(t)}(p)/dp$ is positive in the
point $\bar{p}$ of intersection of $\mu_n^{(t)}(p)$ and $\mu_{n-1}^{(t)}(p)$, so that
$\underline{p} \le p_n^{(t)} < \bar{p}$. The final step of the proof includes the direct
verification of the fact that $\underline{p} \ge t/n$ and $\bar{p} \le (t + 1)/(n + 1)$ (which
we omit) and hence

$$t/n \le p_n^{(t)} \le (t + 1)/(n + 1) . \tag{2.20}$$


It is easy to show that the equality in (2.20) holds only
if n = 2t. Indeed, if n = 2t we obtain from (2.2) (taking into
account that $\mu_{t+k}^{(k)} = \mu_{t+k}^{(t)}$)

$$\mu_{2t}^{(t)}(p) = \frac{1 + \sum_{k=1}^{t-1} \mu_{t+k}^{(t)} (\pi_{2t,t+k} + \pi_{2t,t-k})}{1 - p^{2t} - q^{2t}} \tag{2.21}$$

and the monotonicity of $\mu_n^{(t)}$ implies that p = 1/2 yields
the minimum for $\mu_{2t}^{(t)}(p)$.

The proof of the existence of

$$\lim_{n\to\infty} n\,p_n^{(t)} = \theta_t , \qquad t \le \theta_t \le t + 1 ,$$

is similar to that in the case t = 1.


2.3  Approximation  to  the  Optimal  Procedure. 

The above results enable us to prove that the proximity of
t/n and $p_n^{(t)}$ implies the same for $\mu_n^{(t)}$ and $\tilde\mu_n^{(t)}$, where the latter
refers to the second criterion discussed in Section 2.1 above.


Theorem 2.2: There exists an $\varepsilon > 0$ such that for all
$n < \infty$ and $t \le (n + 1)/2$

$$\tilde\mu_n^{(t)} - \mu_n^{(t)} < \varepsilon .$$


Proof: We can write that

$$\tilde\mu_n^{(t)} - \mu_n^{(t)} = \bigl(\tilde\mu_n^{(t)} - \hat\mu_n^{(t)}\bigr) + \bigl(\hat\mu_n^{(t)} - \mu_n^{(t)}\bigr) , \tag{2.22}$$

where $\hat\mu_n^{(t)}$ is defined by (2.2) with optimal $\mu_s^{(u)}$, $u \le t$, $s \le n - 1$,
and p = t/n. The first term in (2.22) is an "improvement" introduced
to the optimal procedure in the sense of the second criterion with
r = 1 by replacing the values $\tilde\mu_s^{(u)}$ with the smaller values $\mu_s^{(u)}$, and the
second term is an "improvement" to $\hat\mu_n^{(t)}$ due to the optimal choice
of p.

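Let us suppose now that the statement of the theorem holds
for all $u \le t - 1$. Then we get from (2.21) (taking into account
that $\tilde p_{2t}^{(t)} = 1/2$ and denoting $\sup_{2u-1 < s;\ u \le t-1} \bigl(\tilde\mu_s^{(u)} - \mu_s^{(u)}\bigr)$ by $\varepsilon_{t-1}$)
that

$$\begin{aligned}
\tilde\mu_{2t}^{(t)} - \mu_{2t}^{(t)} = \tilde\mu_{2t}^{(t)} - \hat\mu_{2t}^{(t)}
&= \frac{\sum_{k=1}^{t-1} \bigl(\tilde\mu_{t+k}^{(k)} - \mu_{t+k}^{(k)}\bigr)(\pi_{2t,t+k} + \pi_{2t,t-k})}{1 - p^{2t} - q^{2t}} \\
&\le \varepsilon_{t-1} - \varepsilon_{t-1} \frac{\binom{2t}{t} 2^{-2t}}{1 - 2^{-2t+1}} .
\end{aligned} \tag{2.23}$$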


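The inequality above serves as the basis for the induction.
If we suppose now that the statement of the theorem holds for all
$u \le t$ and $s \le n - 1$ with $\varepsilon = \varepsilon_{n-1}$, then from (2.2) we can obtain
that

$$\tilde\mu_n^{(t)} - \hat\mu_n^{(t)} \le \varepsilon_{n-1} - \varepsilon_{n-1} \frac{\binom{n}{t} \bigl(\frac{t}{n}\bigr)^t \bigl(1 - \frac{t}{n}\bigr)^{n-t}}{1 - \bigl(\frac{t}{n}\bigr)^n - \bigl(1 - \frac{t}{n}\bigr)^n} . \tag{2.24}$$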

In  order  to  estimate  the  second  term  in  (2.22)  we  can  derive  (2.16) 
that  in  the  interval  (t/n,p^*^) 


(2.25) 


dp 


nC^q 


n-1 


,  n  n 
1  -  p  -  q 


where  is  a  constant  for  a  fixed  t  and  <  C^t,  where 

does  not  depend  on  t  and  <  1. 

From  (2.24)  and  (2.25)  we  can  see  now  (using  the  inequality 


-it)  (t)  (t)  t, 

'•n  - 1'''- 


— <P<P 

n=  n 


that 


(2.26) 


and 


'(t) 

‘'n 


(t) 


u  <  e 


_  l)n-t  .  _  l)n-t 

1  -  (J)“  -  (1  -  V 

n  n 


'(t)  (t)  ^ 

u  -  u  <  e 
* 


’  V  • 


or  If 


{2.27) 


Since  a^‘*'0  as  t-^*  the  e  can  always  be  chosen  to 

satisfy  the  statement  of  the  theorem. 

The calculations presented in Table 1 show that for
$t \le 7$ we have $\varepsilon_t \le \varepsilon_1$, and since $\varepsilon_1 < 0.014$ we have
$\varepsilon = \varepsilon_1 \approx 0.014$.

From Theorem 2.2 and Table 1 we can see that from a practical point
of view the optimal procedure for the second criterion (with r = 1)
is nearly optimal in the sense of the first criterion. Clearly, the
so-called t/n-procedure is more convenient since it does not require
specific tables for determining the $p_n^{(t)}$'s.


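Table 1. Selecting without order the t largest: optimal values p_n^(t),
the corresponding expected number of questions mu_n^(t), the expected
number of questions for the t/n-procedure, and the difference between
the two expectations. In each row n increases from left to right; the
last column is the limit as n tends to infinity, which for the p-rows
is theta_t = lim n p_n^(t). A "?" marks characters that are illegible
in the source.

Optimal procedure (basic criterion)

 t  quantity     (1)       (2)       (3)       (4)       (5)       limit
 1  p_n^(t)      .5        .34627    .26557    ?         .02261    1.14852*
    mu_n^(t)     2.0       2.16507   2.23?     2.35625   2.41389   2.42778
 2  p_n^(t)      .5        .40582    .34173    .20969    .04318    2.17566
    mu_n^(t)     2.38004   2.47956   2.53757   2.63849   2.73979   2.76250
 3  p_n^(t)      .5        .4?       .38017    .30683    .06326    3.18865
    mu_n^(t)     2.59500   2.66373   2.70961   2.76729   2.91525   2.94586
 4  p_n^(t)      ?         .44657    .40350    .27236    .08324    4.19669
    mu_n^(t)     2.74003   2.79158   2.82?     2.92477   3.03?     3.06868
 5  p_n^(t)      .5        .45608    .41919    .35748    .10317    5.20228
    mu_n^(t)     2.84750   2.88?     2.91945   2.98061   3.11405   3.15957
 6  p_n^(t)      .5        .46263    .40348    .30388    .12307    6.20649
    mu_n^(t)     2.93189   2.96538   2.99198   3.0?      3.17819   3.23100
 7  p_n^(t)      .5        .46751    .3?       .23691    .14295    7.20982
    mu_n^(t)     3.00083   3.04909   3.11390   3.18262   3.22936   3.28?

Expected number of questions for the t/n-procedure

 t               (1)       (2)       (3)       (4)       (5)       limit
 1                         2.15667   2.2?      2.3?      2.42676   2.44144
 2               2.38095   2.48139   2.54042   2.64418   2.74998   2.77329
 3               2.59706   2.66642   2.71295   2.77175   2.92435   2.95512
 4               2.74307   2.79508   2.83286   2.93045   3.05913   3.07700
 5               2.85131   2.892?    2.92397   2.98601   3.12240   3.16726
 6               2.93634   2.97011   2.99697   3.08726   3.18641   3.23823
 7               3.00579   3.05427   3.12005   3.18984   3.23751   3.29651

Difference (t/n-procedure minus optimal)

 t               (1)       (2)       (3)       (4)       (5)       limit
 1               0         .00150    .00355    .00699    .01287    .01366
 3               .00206    .00269    .00834    .00446    .00912    .00926
 4               .00304    .00550    .004?     .00568    .008?     .00652
 6               .00445    .00473    .00499    .00623    .00822    .00723
 7               .00496    .00518    .00615    .00722    .00615    .00687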
3. The Complete Ordering of a Sample

Let us denote by $G_n$ the expected number of required steps.
Then in the same manner as above for $\mu_n^{(t)}$ we can write

$$G_n(p) = \frac{1 + \sum_{r=1}^{n-1} \binom{n}{r} p^r q^{n-r} (G_r + G_{n-r})}{1 - p^n - q^n} \qquad (G_1 = 0) . \tag{3.1}$$

In order to agree on one definite procedure (out of many equivalent
procedures) after the first question has been asked, it is understood
that we shall first order the r subjects in one of the two subgroups
formed and then order the remaining n - r subjects. Thus the
minimization of (3.1) will provide the optimal results in the sense
of expectation.
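The expectation for complete ordering can be computed from the recurrence once p is fixed for every group size. The sketch below assumes the recurrence in the form $G_n(p) = \{1 + \sum_{r=1}^{n-1} \binom{n}{r} p^r q^{n-r} (G_r + G_{n-r})\} / (1 - p^n - q^n)$ with $G_1 = 0$, and uses the same p at every step; the function name is hypothetical and exact rational arithmetic is a convenience chosen here.

```python
from fractions import Fraction
from math import comb

def ordering_expectations(n_max, p=Fraction(1, 2)):
    """Return {n: G_n} for n = 1..n_max when every question uses the same p.

    Assumes G_1 = 0: a single item needs no questions to be ordered.
    """
    q = 1 - p
    G = {1: Fraction(0)}
    for n in range(2, n_max + 1):
        # A split into r "yes" and n - r "no" answers leaves two
        # independent ordering problems of sizes r and n - r.
        num = 1 + sum(comb(n, r) * p**r * q**(n - r) * (G[r] + G[n - r])
                      for r in range(1, n))
        G[n] = num / (1 - p**n - q**n)
    return G

G = ordering_expectations(5)
values = {n: float(g) for n, g in G.items()}
```

With p = 1/2 the first values are 2, 10/3, 100/21 and 652/105, that is 2.00000, 3.33333, 4.76190 and 6.20952.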

We can show algebraically that for $n \le 5$ the optimal value
of $p_n$ is 1/2. For many values of $n \ge 6$ it is no longer true;
however, the procedure with $p_n = 1/2$ for all n serves in this case
as an approximation to the optimal procedure, just like the
"t/n-procedure" from Section 2 does in the case of selecting without
order the t largest. Unlike the previous problem we have not obtained
exact analytic results on the limiting approach of the "1/2-procedure"
to the optimal procedure, but we do have considerable empirical
information about this which we will describe later. What we do have is an
explicit upper bound for $n \ge 6$ for the optimal procedure which is
based on the 1/2-procedure. Furthermore, the numerical results
in Table 2 indicate that the difference between the $G_n$-results for
p = 1/2 (denoted by $\hat G_n$) and the optimal $G_n$-value is extremely
small.


Theorem 3.1: For the 1/2-procedure we have the exact result

(3.2)    $\bar G_n = \sum_{r=2}^{n} (-1)^r \binom{n}{r} (r - 1) \left(1 - 2^{1-r}\right)^{-1} = \dfrac{n - 1}{\ln 2} + \alpha_n$ ,

where $0 < \alpha_n < 1/2$ $(n = 2, 3, \ldots)$.


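Proof: From (3.1) with $p = 1/2$ we have

(3.3)    $2^n \bar G_n = 2^n + 2 \sum_{r=0}^{n} \binom{n}{r} \bar G_r - 2(n + 1)$ ,

where we define $\bar G_0 = \bar G_1 = 1$ in order for (3.3) to hold
for $n = 0, 1$. Multiplying through by $z^n/n!$ and summing over $n$,
we let

         $y(z) = \sum_{n=0}^{\infty} \bar G_n \dfrac{z^n}{n!}$

and obtain

(3.4)    $y(2z) = e^{2z} + 2y(z)e^{z} - 2 \sum_{n=0}^{\infty} \dfrac{n + 1}{n!} z^n = e^{2z} + 2y(z)e^{z} - 2(z + 1)e^{z}$ .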


Table 2

Complete Ordering

  n    Optimal p_n    Optimal G_n    G_n for p = 1/2    Asymptotic n/ln 2 - 1

  2      .50000         2.00000         2.00000             -
  3      .50000         3.33333         3.33333             -
  4      .50000         4.76190         4.76190             -
  5      .50000         6.20952         6.20952             -
  6      .53686         7.65650         7.65653             -
  7      .63306         9.09874         9.100?6             -
  8      .63?62        10.54?59        10.54268             -
  9      .64504        11.98319        11.98451          11.98425
 10      .50000        13.42997        13.42660             -
 11      .50000        14.86811        14.86909             -
 12      .50000        16.31053        16.31188             -
 13      .50000        17.75332        17.75481             -
 14      .50000        19.19579        19.19775             -
 15      .50000        20.63847        20.64062             -
 16      .58998        22.08106        22.08341             -
 17      .62204        23.52360        23.52610             -
 18      .63860        24.96631           -                 -
 19      .64838        26.4?66?           -                 -
 20      .65530        27.85120        27.85391             -
 21      .66184        29.29377        29.29651             -
 22      .67058        30.73636        30.73914             -
 23      .68381        32.178??        32.18179             -
 24      .50000        33.621?8        33.62447             -
 25      .50000        35.06402        35.06719             -
 26      .50000        36.50657        36.50992             -
 27      .50000        37.94912        37.95266             -
 28      .50000        39.39169        39.39542             -
 29      .50000        40.83429        40.83817          40.83815
 30      .50000        42.27682        42.28092          42.28085
 31      .50000        43.71999        43.72966          43.72355
 32      .55179        45.16196        45.16639          45.16625
 33        ?           46.60492        46.60911          46.60894
 34      .5?799        48.04706        48.05381          48.05363
 35        ?           49.48964        49.49450          49.49433
 ...
 46      .67196        65.35780        65.36382             -
 47      .678?9        66.80096        66.80650             -
 48      .68690        68.24292        68.24919          68.24936
 49      .50000           -            69.69188             -
 50      .50000           -            71.13458          71.13475

A ? marks an illegible digit and - a lost entry. Legible entries of the
Difference column whose rows cannot be identified: .00191, .00194,
.00195, .00204.
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Table  2 

Complete Ordering (Continued)

  n    Optimal p_n    Optimal G_n    G_n for p = 1/2    Asymptotic n/ln 2 - 1    Difference

 60      .50000        85.55365        85.56177           85.56170              .00805
 61      .50000        86.99621        87.00449           87.00440              .00819
 62      .50000        88.43877        88.44721           88.44709              .00832
 63      .50000        89.88133        89.88992           89.88978              .00845
 64      .55577        91.32390        91.33???           91.33249              .00859
 65      .57481        92.76646        92.77535           92.77518              .00872
 ...
 76      .61995       108.63462       108.64495          108.64482              .01020
 77      .62904       110.07718       110.08762          110.08751              .01033
 78      .63867       111.51974       111.53029          111.53021              .01047
 79      .64672       112.96231       112.97297          112.97291              .01060
 80      .65261       114.40487       114.4156?              -                  .01074
 81      .65675       115.84743       115.85831          115.85830              .01087
 82      .65934       117.28999       117.30098          117.30100              .01101
 83      .66176       118.73255       118.74365          118.74369              .01114
 ...
 90      .67282       128.83047       128.84258          128.84255              .01208

A ? marks an illegible digit and - a lost entry. The Difference column
is the asymptotic value minus the optimal $G_n$.


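Multiplying (3.4) by $e^{-2z}$, we let

         $F(z) = e^{-z} y(z) = \sum_{n=0}^{\infty} A_n \dfrac{z^n}{n!}$

and obtain

(3.5)    $F(2z) = 2F(z) + 1 - 2(z + 1)e^{-z}$ ,

(3.6)    $\sum_{n=0}^{\infty} A_n \dfrac{(2z)^n}{n!} = 2 \sum_{n=0}^{\infty} A_n \dfrac{z^n}{n!} + 1 + 2 \sum_{n=0}^{\infty} (-1)^n \dfrac{(n - 1) z^n}{n!}$ .

Equating coefficients, we obtain for $n \ge 2$

(3.7)    $A_n = \dfrac{(-1)^n (n - 1)}{2^{n-1} - 1}$ ,

where $A_0$ and $A_1$ will be found later. Hence it follows that

(3.8)    $y(z) = e^{z} F(z) = \sum_{m=0}^{\infty} \dfrac{z^m}{m!} \sum_{s=0}^{\infty} A_s \dfrac{z^s}{s!} = \sum_{n=0}^{\infty} \dfrac{z^n}{n!} \sum_{r=0}^{n} \binom{n}{r} A_r$ .

Using (3.8) and the definition of $y(z)$, we obtain

(3.9)    $\bar G_n = \sum_{r=0}^{n} \binom{n}{r} A_r$ .

From (3.9) for $n = 0$ and 1 we find that $A_0 = \bar G_0 = 1$ and

(3.10)   $\bar G_1 = 1 = A_0 + A_1 = 1 + A_1 \Rightarrow A_1 = 0$ .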
Hence we obtain the final result from (3.7), (3.9) and (3.10):

(3.11)   $\bar G_n = 1 + \sum_{r=2}^{n} (-1)^r \binom{n}{r} \dfrac{r - 1}{2^{r-1} - 1} = \sum_{r=2}^{n} (-1)^r \binom{n}{r} \dfrac{r - 1}{1 - 2^{1-r}}$ .

(In (3.11) we used the identity $\sum_{r=2}^{n} (-1)^r \binom{n}{r} (r - 1) = 1$.)
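
The closed form (3.11) can be checked against the recursion that (3.1) gives for p = 1/2. The sketch below does this in exact rational arithmetic. It starts the recursion from $G_0 = G_1 = 0$ (our assumption of the true starting values; the convention $G_0 = G_1 = 1$ in the proof serves only to make (3.3) hold for n = 0, 1), and the function names are ours.

```python
from fractions import Fraction
from math import comb

def g_half_closed(n):
    """Closed form (3.11) for the 1/2-procedure, valid for n >= 2."""
    return 1 + sum(
        Fraction((-1) ** r * comb(n, r) * (r - 1), 2 ** (r - 1) - 1)
        for r in range(2, n + 1)
    )

def g_half_recursive(n_max):
    """Values from (3.1) with p = q = 1/2, solved for G_n.

    Assumption: G_0 = G_1 = 0, i.e. nothing has to be ordered.
    Returns a list indexed by n.
    """
    g = [Fraction(0), Fraction(0)]
    for n in range(2, n_max + 1):
        s = sum(comb(n, r) * (g[r] + g[n - r]) for r in range(1, n))
        # (1 - 2^(1-n)) G_n = 1 + 2^(-n) s, multiplied through by 2^n
        g.append((2 ** n + s) / Fraction(2 ** n - 2))
    return g

if __name__ == "__main__":
    g = g_half_recursive(12)
    for n in range(2, 13):
        assert g_half_closed(n) == g[n]
        print(n, float(g[n]))
```

The first values are 2, 10/3 and 100/21, in agreement with the entries 2.00000, 3.33333 and 4.76190 of Table 2.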


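Asymptotic evaluation of $\bar G_n$

From (3.11) we also obtain for $\Delta \bar G_n = \bar G_n - \bar G_{n-1}$, for $n > 2$,

(3.12)   $\Delta \bar G_n = \sum_{r=2}^{n} (-1)^r \binom{n-1}{r-1} \dfrac{r - 1}{1 - (1/2)^{r-1}}$

         $= (n - 1) \sum_{s=0}^{n-2} (-1)^s \binom{n-2}{s} \left[ 1 + (1/2)^{s+1} + (1/2)^{2s+2} + \cdots \right]$

         $= (n - 1) \sum_{\alpha=1}^{\infty} \dfrac{1}{2^{\alpha}} \left( 1 - \dfrac{1}{2^{\alpha}} \right)^{n-2}$ ;

if we set $\bar G_1 = 1$ then (3.12) also holds for $n = 2$.

From (3.12), using the definition $\bar G_1 = 1$, we obtain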
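(3.13)   $\bar G_n = 1 + \sum_{\alpha=1}^{\infty} \dfrac{1}{2^{\alpha}} \sum_{i=2}^{n} (i - 1) \left( 1 - \dfrac{1}{2^{\alpha}} \right)^{i-2}$ .

Letting $\theta = 1 - (1/2^{\alpha})$ and $j = i - 1$, we can write this
for $n \ge 2$ as

(3.14)   $\bar G_n = \sum_{\alpha=0}^{\infty} \dfrac{1}{2^{\alpha}} \sum_{j=0}^{n-1} j\, \theta^{j-1}$

(the term $\alpha = 0$, for which $\theta = 0$, contributes the leading 1).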


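We now use the Euler-Maclaurin sum formula for (3.14). For the
analogous integral $I$, using $x$ for $\alpha$ and letting
$y = 1 - (1/2)^x$, so that $dx = dy/[(1 - y)\ln 2]$, we obtain

(3.15)   $I = \int_0^{\infty} \left\{ 2^x \left[ 1 - \left( 1 - \dfrac{1}{2^x} \right)^n \right] - n \left( 1 - \dfrac{1}{2^x} \right)^{n-1} \right\} dx$

         $= \dfrac{1}{\ln 2} \int_0^1 \left[ \dfrac{1 - y^n}{(1 - y)^2} - \dfrac{n y^{n-1}}{1 - y} \right] dy$

         $= \dfrac{1}{\ln 2} \int_0^1 \left[ 1 + 2y + 3y^2 + \cdots + (n - 1) y^{n-2} \right] dy = \dfrac{n - 1}{\ln 2}$ .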
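
The double sum (3.14) and the integral (3.15) can be compared numerically. The sketch below truncates the outer sum over $\alpha$ after a fixed number of terms (our assumption; the neglected tail is of order $n^2 2^{-\text{terms}}$) and reports the gap between the sum and $(n - 1)/\ln 2$. The function names are ours.

```python
from math import log

def g_half_series(n, terms=80):
    """Truncated double sum (3.14) for the 1/2-procedure, n >= 2."""
    total = 0.0
    for a in range(terms):
        theta = 1.0 - 0.5 ** a          # theta = 1 - 1/2^alpha
        # the j = 0 term of (3.14) vanishes, so the inner sum starts at j = 1
        inner = sum(j * theta ** (j - 1) for j in range(1, n))
        total += 0.5 ** a * inner
    return total

def integral_315(n):
    """Value (n - 1)/ln 2 of the integral I in (3.15)."""
    return (n - 1) / log(2)

if __name__ == "__main__":
    for n in (2, 3, 10, 50):
        s = g_half_series(n)
        print(n, round(s, 5), round(s - integral_315(n), 5))
```

The printed gap is the quantity that the Euler-Maclaurin correction terms are meant to account for.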


The two correction terms for the Euler-Maclaurin sum formula yield
1/2 and 0; using the same analysis as in (3.15) it is easy to show
that the remainder term is bounded by 1/2. Hence the asymptotic
result for $\bar G_n$ is


(3.16)   $\bar G_n \approx \dfrac{n}{\ln 2} - 1 \approx 1.44269\, n - 1$ ,
where the adjusted constant is based on empirical results. (For
$n = 50$ this gives 71.1345 and the exact result for $\bar G_n$ is
71.1342, an error of less than 1/200 of 1%. The optimal result for
$n = 50$ is 71.1280, an error in $\bar G_n$ of about 1/100 of 1%.)

In this problem the best we can do is to point out that we
need a minimum of at least $n - 1$ questions to separate all the $n$
observations, and (3.16) shows that for the "1/2-procedure" we need
on the average only about 44% more than this minimum.
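
The figure of about 44% comes from $1/\ln 2 \approx 1.4427$. The short sketch below evaluates the approximation $n/\ln 2 - 1$ and its relative excess over the minimum $n - 1$; the function names are ours.

```python
from math import log

def asymptotic_g(n):
    """Approximation n/ln 2 - 1 to the expected number of questions."""
    return n / log(2) - 1.0

def excess_over_minimum(n):
    """Relative excess of the approximation over the minimum n - 1."""
    return asymptotic_g(n) / (n - 1) - 1.0

if __name__ == "__main__":
    for n in (10, 50, 100, 1000):
        print(n, round(asymptotic_g(n), 5), round(excess_over_minimum(n), 4))
```

For moderate n the excess is slightly above 44% and it approaches $1/\ln 2 - 1$ as n grows.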

It should be mentioned here that from a computational point of
view the search for the optimal solution in our last setting represents
a significant problem. The problem is that $G_n$ as a function of $n$
is so closely approximated by a linear function of $n$, and that for
fixed $n$, $G_n(p)$ is almost constant, so that the search for the $p$
which yields the minimum of $G_n(p)$ is quite difficult. Thus, for
$n \ge 10$ the variation of $p$ in the interval [.5, .7] does not change
the first two decimals in $G_n(p)$. However, the differences
$n(\ln 2)^{-1} - 1 - G_n$ (which show up in the 3rd decimal) tend to
grow very slowly with $n$, so that the correction term, namely the
constant 1 in $G_n \approx n(\ln 2)^{-1} - 1$, should actually be larger
than one, say of the form 1 plus a very slowly increasing function of $n$.

From Table 2 we empirically observe a cyclic pattern for the optimal
p-value which ought to be described. The optimal p-value is always
between .5 and a constant that appears to be close to $\ln 2 = .693\ldots$
For $2 \le n \le 5$ the optimal p is .5; for $6 \le n \le 9$ it increases;
for $10 \le n \le 15$ it is again .5; for $16 \le n \le 23$ it increases;
for $24 \le n \le 31$ it is again .5; for $32 \le n \le 48$ it increases;
for $49 \le n \le 63$ it is again .5 and it increases for $n \ge 64$. For
large $r$ we conjecture that the optimal $p$ will be 1/2 for
$3 \cdot 2^{r-1} \le n < 2^{r+1}$, that it will increase between 1/2 and
some constant close to $\ln 2$ for $2^r \le n < 3 \cdot 2^{r-1}$, and that
it will follow this type of cyclic pattern indefinitely. Furthermore, we
conjecture that the small variation in $G_n(p)$ as $p$ varies
from .5 to .7 will persist for large values of $n$, so that the
1/2-procedure will always give an answer which is equal to the optimal
$G_n$-value to 2 or 3 decimal places.
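
The conjectured cycle can be restated compactly. Writing r for the largest integer with $2^r \le n$, the optimal p is expected to exceed 1/2 when $n < 3 \cdot 2^{r-1}$ and to equal 1/2 otherwise. The helper below only encodes this reading of the conjecture; the conjecture is stated for large r, and the observed boundaries for small n are shifted slightly. The name `conjectured_regime` is ours.

```python
def conjectured_regime(n):
    """Classify n according to the conjectured cyclic pattern.

    With r the largest integer such that 2**r <= n, returns
    "increasing" if 2**r <= n < 3 * 2**(r - 1) and
    "half"       if 3 * 2**(r - 1) <= n < 2**(r + 1).
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    r = n.bit_length() - 1              # largest r with 2**r <= n
    # 2n < 3 * 2^r is the integer form of n < 3 * 2^(r-1)
    return "increasing" if 2 * n < 3 * 2 ** r else "half"

if __name__ == "__main__":
    for n in (16, 23, 24, 31, 32, 64, 100):
        print(n, conjectured_regime(n))
```

For r = 4 this reproduces the observed ranges $16 \le n \le 23$ (increasing) and $24 \le n \le 31$ (p = .5).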

It is interesting to mention here that the natural generalization
to the problem of complete ordering of the second optimality criterion
from the selection problem (namely, maximizing the probability of a
complete ordering in $n - 1$ steps) leads for $n > 2$ to the equation

(3.17)   $P_n(p) = \sum_{k=1}^{n-1} \pi_{nk} P_k P_{n-k} = \sum_{k=2}^{n-2} \pi_{nk} P_k P_{n-k} + (\pi_{n1} + \pi_{n,n-1}) P_{n-1}$ ,

where $P_k$ denotes the probability of the complete ordering of a sample
of size $k$ in $k - 1$ steps and the $\pi_{nk}$ are binomial probabilities:
$\pi_{nk} = \binom{n}{k} p^k (1 - p)^{n-k}$. For $n < 6$ the optimal $p$
(which yields the maximum for $P_n(p)$) is 1/2; in the same way as above
we can consider a "1/2-procedure" and denote by $\bar P_n$ the
corresponding probabilities and by $P_n$ the maximal values over $p$ of
$P_n(p)$.

It is easy to show that (3.17) implies for $n > 2$ the inequalities

(3.18)  |(|)""^  ^  • 
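
The recursion (3.17) is easy to evaluate numerically. The sketch below assumes $P_1 = 1$ (a single observation is trivially ordered), maximizes over a grid of p-values in [1/2, 1) using the symmetry in p and 1 - p, and also computes the values obtained with p = 1/2 throughout. The function names and the grid size are ours, and the grid search only approximates the true maximum.

```python
from math import comb

def p_of_p(n, p, best):
    """Right-hand side of (3.17): sum over k of pi_nk * P_k * P_(n-k)."""
    q = 1.0 - p
    return sum(
        comb(n, k) * p**k * q**(n - k) * best[k] * best[n - k]
        for k in range(1, n)
    )

def ordering_probabilities(n_max, grid=2001):
    """Return (p_max, p_half), both indexed by n (index 0 is unused).

    p_max[n]  : grid maximum over p of (3.17), built on earlier maxima
    p_half[n] : value of (3.17) when p = 1/2 is used at every stage
    Assumption: P_1 = 1.
    """
    p_max = [None, 1.0]
    p_half = [None, 1.0]
    for n in range(2, n_max + 1):
        candidates = (0.5 + 0.49 * i / (grid - 1) for i in range(grid))
        p_max.append(max(p_of_p(n, p, p_max) for p in candidates))
        p_half.append(p_of_p(n, 0.5, p_half))
    return p_max, p_half

if __name__ == "__main__":
    p_max, p_half = ordering_probabilities(8)
    for n in range(2, 9):
        print(n, round(p_max[n], 5), round(p_half[n], 5))
```

For n = 2, 3, 4 the 1/2-procedure gives 1/2, 3/8 and 9/32.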


For the problem of the selection of the $t$ largest with
ordering out of $n$, we can easily write the equation for the
expectation of the number of questions:

(3.19)   $G_n^{(t)}(p) = \dfrac{1 + \sum_{k=1}^{t} \pi_{nk} \left( G_k + G_{n-k}^{(t-k)} \right) + \sum_{k=t+1}^{n-1} \pi_{nk} G_k^{(t)}}{1 - p^n - q^n}$ ,

where $G_k$ denotes the minimal expectation for the problem of
complete ordering, $G_k^{(t)} = \min_p G_k^{(t)}(p)$, and the $\pi_{nk}$
are binomial probabilities as before.

It is easy to see that

(3.20)   $G_t \le G_n^{(t)} \le G_t + (\text{expected number of steps needed to select the } t \text{ largest without order})$ ,

where the right-hand side corresponds to the expected total number
of steps in the procedure "y/G" in which we first select the $t$
largest and then order them. Since $G_t$ is of order $t$ and the second
term on the right-hand side of (3.20) is of the order ... or less, with
large values of $t$ the optimal procedure for this problem of selection
with ordering does not give a qualitative improvement over the
procedure "y/G" described above. However, our conjecture is that the
optimal value of $p$ for the selection with ordering is between $t/2n$
and $t/n$ and that there exists a constant v such that
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